Overview

Dataset statistics

Number of variables: 11
Number of observations: 22191
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 MiB
Average record size in memory: 88.0 B
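
The two memory figures are consistent with each other. The following is a minimal sketch of the arithmetic, assuming every column is stored as an 8-byte float (the report only says "Numeric", so the dtype is an assumption) and ignoring the small index overhead:

```python
# Assumption: each of the 11 numeric columns holds 8-byte float64 values.
# The report does not state the dtype; 88.0 B / 11 columns merely suggests it.
n_variables = 11
n_observations = 22191
bytes_per_value = 8  # assumed float64

record_bytes = n_variables * bytes_per_value   # size of one row
total_bytes = record_bytes * n_observations    # size of all rows
total_mib = total_bytes / 2**20                # bytes -> MiB

print(f"average record size: {record_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
```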

Variable types

Numeric: 11

Alerts

AT is highly correlated with AH and 1 other field (High correlation)
AH is highly correlated with AT (High correlation)
AFDP is highly correlated with GTEP and 3 other fields (High correlation)
GTEP is highly correlated with AFDP and 4 other fields (High correlation)
TIT is highly correlated with AFDP and 4 other fields (High correlation)
TEY is highly correlated with AFDP and 4 other fields (High correlation)
CDP is highly correlated with AFDP and 4 other fields (High correlation)
CO is highly correlated with GTEP and 3 other fields (High correlation)
NOX is highly correlated with AT (High correlation)
AT is highly correlated with AH and 1 other field (High correlation)
AH is highly correlated with AT (High correlation)
AFDP is highly correlated with GTEP and 3 other fields (High correlation)
GTEP is highly correlated with AFDP and 5 other fields (High correlation)
TIT is highly correlated with AFDP and 4 other fields (High correlation)
TAT is highly correlated with GTEP and 2 other fields (High correlation)
TEY is highly correlated with AFDP and 5 other fields (High correlation)
CDP is highly correlated with AFDP and 5 other fields (High correlation)
CO is highly correlated with GTEP and 3 other fields (High correlation)
NOX is highly correlated with AT (High correlation)
AFDP is highly correlated with GTEP and 1 other field (High correlation)
GTEP is highly correlated with AFDP and 4 other fields (High correlation)
TIT is highly correlated with AFDP and 4 other fields (High correlation)
TEY is highly correlated with GTEP and 3 other fields (High correlation)
CDP is highly correlated with GTEP and 3 other fields (High correlation)
CO is highly correlated with GTEP and 3 other fields (High correlation)
AT is highly correlated with AP and 7 other fields (High correlation)
AP is highly correlated with AT (High correlation)
AH is highly correlated with AT (High correlation)
AFDP is highly correlated with GTEP and 5 other fields (High correlation)
GTEP is highly correlated with AT and 7 other fields (High correlation)
TIT is highly correlated with AT and 7 other fields (High correlation)
TAT is highly correlated with AT and 5 other fields (High correlation)
TEY is highly correlated with AT and 7 other fields (High correlation)
CDP is highly correlated with AT and 7 other fields (High correlation)
CO is highly correlated with AFDP and 5 other fields (High correlation)
NOX is highly correlated with AT and 5 other fields (High correlation)

Reproduction

Analysis started: 2022-07-01 15:14:45.627541
Analysis finished: 2022-07-01 15:15:18.930121
Duration: 33.3 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

AT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15988
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.71224675
Minimum: 0.28985
Maximum: 34.929
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 0.28985
5-th percentile: 5.99275
Q1: 11.6645
Median: 17.739
Q3: 23.657
95-th percentile: 29.399
Maximum: 34.929
Range: 34.63915
Interquartile range (IQR): 11.9925

Descriptive statistics

Standard deviation: 7.352788794
Coefficient of variation (CV): 0.4151245688
Kurtosis: -0.9509019253
Mean: 17.71224675
Median Absolute Deviation (MAD): 6.006
Skewness: 0.008794502091
Sum: 393052.4676
Variance: 54.06350305
Monotonicity: Not monotonic
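
Several of the figures above are derived from the others. As a small worked check using the AT values reported in this section (the variable names are only illustrative), the range, IQR, coefficient of variation and variance can be recomputed like this:

```python
# Values copied from the AT summary in this report.
minimum, maximum = 0.28985, 34.929
q1, q3 = 11.6645, 23.657
mean, std = 17.71224675, 7.352788794

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(f"range={value_range:.5f} iqr={iqr:.4f} cv={cv:.6f} variance={variance:.4f}")
```

The same relations hold for every variable in the report, so they are a quick way to sanity-check a transcribed value.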
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
11.92 | 6 | < 0.1%
10.992 | 6 | < 0.1%
23.969 | 6 | < 0.1%
12.068 | 6 | < 0.1%
25.597 | 6 | < 0.1%
11.288 | 6 | < 0.1%
18.091 | 6 | < 0.1%
24.247 | 5 | < 0.1%
14.662 | 5 | < 0.1%
17.203 | 5 | < 0.1%
Other values (15978) | 22134 | 99.7%

Smallest values
Value | Count | Frequency (%)
0.28985 | 1 | < 0.1%
0.38289 | 1 | < 0.1%
0.5037 | 1 | < 0.1%
0.5223 | 1 | < 0.1%
0.58759 | 1 | < 0.1%
0.60394 | 1 | < 0.1%
0.76744 | 1 | < 0.1%
0.78907 | 1 | < 0.1%
0.83101 | 1 | < 0.1%
0.86433 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
34.929 | 1 | < 0.1%
34.903 | 1 | < 0.1%
34.831 | 1 | < 0.1%
34.748 | 1 | < 0.1%
34.665 | 1 | < 0.1%
34.619 | 1 | < 0.1%
34.598 | 1 | < 0.1%
34.532 | 1 | < 0.1%
34.491 | 1 | < 0.1%
34.483 | 1 | < 0.1%

AP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 670
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1012.812607
Minimum: 985.85
Maximum: 1034.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 985.85
5-th percentile: 1002.8
Q1: 1008.8
Median: 1012.4
Q3: 1016.7
95-th percentile: 1024
Maximum: 1034.2
Range: 48.35
Interquartile range (IQR): 7.9

Descriptive statistics

Standard deviation: 6.396587838
Coefficient of variation (CV): 0.006315667671
Kurtosis: 0.3759747746
Mean: 1012.812607
Median Absolute Deviation (MAD): 3.9
Skewness: 0.06007877594
Sum: 22475324.56
Variance: 40.91633597
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
1012.1 | 191 | 0.9%
1010.8 | 188 | 0.8%
1011.8 | 188 | 0.8%
1011.9 | 177 | 0.8%
1012 | 177 | 0.8%
1011.4 | 173 | 0.8%
1011.1 | 172 | 0.8%
1012.4 | 170 | 0.8%
1013.6 | 169 | 0.8%
1010.9 | 168 | 0.8%
Other values (660) | 20418 | 92.0%

Smallest values
Value | Count | Frequency (%)
985.85 | 1 | < 0.1%
986.16 | 1 | < 0.1%
986.25 | 1 | < 0.1%
986.41 | 2 | < 0.1%
986.43 | 1 | < 0.1%
986.56 | 1 | < 0.1%
986.78 | 1 | < 0.1%
986.87 | 1 | < 0.1%
987.31 | 1 | < 0.1%
987.43 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
1034.2 | 1 | < 0.1%
1034 | 1 | < 0.1%
1033.9 | 1 | < 0.1%
1033.4 | 2 | < 0.1%
1033.2 | 1 | < 0.1%
1033 | 6 | < 0.1%
1032.8 | 2 | < 0.1%
1032.6 | 1 | < 0.1%
1032.4 | 1 | < 0.1%
1032.3 | 1 | < 0.1%

AH
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17316
Distinct (%): 78.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.55522437
Minimum: 27.504
Maximum: 100.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 27.504
5-th percentile: 52.9385
Q1: 70.2945
Median: 82.781
Q3: 90.532
95-th percentile: 97.5305
Maximum: 100.2
Range: 72.696
Interquartile range (IQR): 20.2375

Descriptive statistics

Standard deviation: 13.91501847
Coefficient of variation (CV): 0.1749101782
Kurtosis: -0.2243244766
Mean: 79.55522437
Median Absolute Deviation (MAD): 9.252
Skewness: -0.7179519553
Sum: 1765409.984
Variance: 193.6277391
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
100.14 | 46 | 0.2%
100.12 | 46 | 0.2%
100.15 | 42 | 0.2%
100.16 | 42 | 0.2%
100.11 | 38 | 0.2%
100.13 | 33 | 0.1%
100.17 | 27 | 0.1%
100.09 | 22 | 0.1%
100.1 | 20 | 0.1%
100.06 | 13 | 0.1%
Other values (17306) | 21862 | 98.5%

Smallest values
Value | Count | Frequency (%)
27.504 | 1 | < 0.1%
30.344 | 1 | < 0.1%
30.899 | 1 | < 0.1%
31.204 | 1 | < 0.1%
31.964 | 1 | < 0.1%
32.617 | 1 | < 0.1%
32.789 | 1 | < 0.1%
32.792 | 1 | < 0.1%
33.023 | 1 | < 0.1%
33.264 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
100.2 | 4 | < 0.1%
100.19 | 1 | < 0.1%
100.18 | 5 | < 0.1%
100.17 | 27 | 0.1%
100.16 | 42 | 0.2%
100.15 | 42 | 0.2%
100.14 | 46 | 0.2%
100.13 | 33 | 0.1%
100.12 | 46 | 0.2%
100.11 | 38 | 0.2%

AFDP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15228
Distinct (%): 68.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.037750016
Minimum: 2.0874
Maximum: 7.6106
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 2.0874
5-th percentile: 2.74735
Q1: 3.44985
Median: 4.0688
Q3: 4.4514
95-th percentile: 5.54575
Maximum: 7.6106
Range: 5.5232
Interquartile range (IQR): 1.00155

Descriptive statistics

Standard deviation: 0.8102228958
Coefficient of variation (CV): 0.2006619758
Kurtosis: 0.168933672
Mean: 4.037750016
Median Absolute Deviation (MAD): 0.4792
Skewness: 0.3755179931
Sum: 89601.7106
Variance: 0.6564611409
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
4.1286 | 7 | < 0.1%
4.5032 | 7 | < 0.1%
4.0934 | 6 | < 0.1%
4.2322 | 6 | < 0.1%
4.1024 | 6 | < 0.1%
4.1256 | 6 | < 0.1%
4.4361 | 6 | < 0.1%
4.4816 | 6 | < 0.1%
4.4273 | 6 | < 0.1%
3.8837 | 6 | < 0.1%
Other values (15218) | 22129 | 99.7%

Smallest values
Value | Count | Frequency (%)
2.0874 | 1 | < 0.1%
2.0992 | 1 | < 0.1%
2.1057 | 1 | < 0.1%
2.1197 | 1 | < 0.1%
2.1395 | 1 | < 0.1%
2.1441 | 1 | < 0.1%
2.1597 | 1 | < 0.1%
2.1673 | 1 | < 0.1%
2.185 | 1 | < 0.1%
2.1866 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
7.6106 | 1 | < 0.1%
7.5549 | 1 | < 0.1%
7.3189 | 1 | < 0.1%
7.2399 | 1 | < 0.1%
6.9831 | 1 | < 0.1%
6.9779 | 1 | < 0.1%
6.956 | 1 | < 0.1%
6.9312 | 1 | < 0.1%
6.927 | 1 | < 0.1%
6.9259 | 1 | < 0.1%

GTEP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9827
Distinct (%): 44.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.31787252
Minimum: 17.878
Maximum: 37.402
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 17.878
5-th percentile: 19.239
Q1: 22.736
Median: 24.989
Q3: 26.839
95-th percentile: 32.898
Maximum: 37.402
Range: 19.524
Interquartile range (IQR): 4.103

Descriptive statistics

Standard deviation: 4.234147408
Coefficient of variation (CV): 0.1672394632
Kurtosis: -0.6488366776
Mean: 25.31787252
Median Absolute Deviation (MAD): 2.049
Skewness: 0.3899686595
Sum: 561828.909
Variance: 17.92800428
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
24.308 | 13 | 0.1%
25.269 | 12 | 0.1%
25.226 | 12 | 0.1%
25.006 | 11 | < 0.1%
24.263 | 11 | < 0.1%
20.296 | 10 | < 0.1%
25.552 | 10 | < 0.1%
24.391 | 10 | < 0.1%
25.487 | 10 | < 0.1%
23.893 | 10 | < 0.1%
Other values (9817) | 22082 | 99.5%

Smallest values
Value | Count | Frequency (%)
17.878 | 1 | < 0.1%
17.912 | 1 | < 0.1%
17.939 | 1 | < 0.1%
17.966 | 1 | < 0.1%
17.974 | 1 | < 0.1%
18.028 | 1 | < 0.1%
18.037 | 1 | < 0.1%
18.039 | 1 | < 0.1%
18.065 | 1 | < 0.1%
18.079 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
37.402 | 1 | < 0.1%
37.34 | 1 | < 0.1%
37.189 | 1 | < 0.1%
37.172 | 1 | < 0.1%
37.068 | 1 | < 0.1%
36.973 | 1 | < 0.1%
36.959 | 1 | < 0.1%
36.95 | 1 | < 0.1%
36.917 | 1 | < 0.1%
36.844 | 1 | < 0.1%

TIT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 722
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1083.08028
Minimum: 1000.8
Maximum: 1100.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 1000.8
5-th percentile: 1052.1
Q1: 1074.6
Median: 1088.1
Q3: 1095.3
95-th percentile: 1100.1
Maximum: 1100.8
Range: 100
Interquartile range (IQR): 20.7

Descriptive statistics

Standard deviation: 16.84076501
Coefficient of variation (CV): 0.01554895358
Kurtosis: 0.0004342854174
Mean: 1083.08028
Median Absolute Deviation (MAD): 9.1
Skewness: -1.025935614
Sum: 24034634.5
Variance: 283.611366
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
1100 | 1348 | 6.1%
1099.9 | 1059 | 4.8%
1100.1 | 816 | 3.7%
1099.8 | 493 | 2.2%
1100.2 | 447 | 2.0%
1100.3 | 249 | 1.1%
1099.7 | 221 | 1.0%
1099.6 | 121 | 0.5%
1085.5 | 119 | 0.5%
1090 | 116 | 0.5%
Other values (712) | 17202 | 77.5%

Smallest values
Value | Count | Frequency (%)
1000.8 | 1 | < 0.1%
1001.3 | 1 | < 0.1%
1001.4 | 2 | < 0.1%
1009.5 | 1 | < 0.1%
1018.3 | 1 | < 0.1%
1022.1 | 1 | < 0.1%
1023.9 | 1 | < 0.1%
1024.4 | 1 | < 0.1%
1024.5 | 1 | < 0.1%
1024.6 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
1100.8 | 1 | < 0.1%
1100.6 | 3 | < 0.1%
1100.5 | 15 | 0.1%
1100.4 | 85 | 0.4%
1100.3 | 249 | 1.1%
1100.2 | 447 | 2.0%
1100.1 | 816 | 3.7%
1100 | 1348 | 6.1%
1099.9 | 1059 | 4.8%
1099.8 | 493 | 2.2%

TAT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2587
Distinct (%): 11.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 545.5201699
Minimum: 512.45
Maximum: 550.61
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 512.45
5-th percentile: 528.35
Q1: 542.6
Median: 549.9
Q3: 550.05
95-th percentile: 550.28
Maximum: 550.61
Range: 38.16
Interquartile range (IQR): 7.45

Descriptive statistics

Standard deviation: 7.708707885
Coefficient of variation (CV): 0.01413093101
Kurtosis: 0.8743698528
Mean: 545.5201699
Median Absolute Deviation (MAD): 0.24
Skewness: -1.498289663
Sum: 12105638.09
Variance: 59.42417725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
550 | 436 | 2.0%
550.01 | 425 | 1.9%
549.98 | 416 | 1.9%
549.99 | 409 | 1.8%
549.97 | 403 | 1.8%
549.96 | 402 | 1.8%
550.03 | 396 | 1.8%
550.04 | 385 | 1.7%
550.02 | 380 | 1.7%
549.95 | 369 | 1.7%
Other values (2577) | 18170 | 81.9%

Smallest values
Value | Count | Frequency (%)
512.45 | 1 | < 0.1%
512.6 | 2 | < 0.1%
513.06 | 1 | < 0.1%
513.09 | 1 | < 0.1%
513.17 | 1 | < 0.1%
513.29 | 1 | < 0.1%
513.47 | 1 | < 0.1%
513.75 | 1 | < 0.1%
514.3 | 1 | < 0.1%
514.43 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
550.61 | 1 | < 0.1%
550.57 | 1 | < 0.1%
550.56 | 1 | < 0.1%
550.53 | 2 | < 0.1%
550.52 | 1 | < 0.1%
550.51 | 1 | < 0.1%
550.5 | 1 | < 0.1%
550.49 | 2 | < 0.1%
550.48 | 5 | < 0.1%
550.47 | 1 | < 0.1%

TEY
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 5013
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 133.5373931
Minimum: 100.17
Maximum: 174.61
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 100.17
5-th percentile: 109.23
Q1: 124.26
Median: 133.77
Q3: 138.645
95-th percentile: 161.935
Maximum: 174.61
Range: 74.44
Interquartile range (IQR): 14.385

Descriptive statistics

Standard deviation: 16.02610712
Coefficient of variation (CV): 0.1200121311
Kurtosis: -0.5534286352
Mean: 133.5373931
Median Absolute Deviation (MAD): 7.74
Skewness: 0.1453197274
Sum: 2963328.29
Variance: 256.8361095
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
133.78 | 173 | 0.8%
133.74 | 167 | 0.8%
133.67 | 154 | 0.7%
133.76 | 154 | 0.7%
133.79 | 143 | 0.6%
133.75 | 132 | 0.6%
133.73 | 131 | 0.6%
133.72 | 130 | 0.6%
133.68 | 127 | 0.6%
133.77 | 126 | 0.6%
Other values (5003) | 20754 | 93.5%

Smallest values
Value | Count | Frequency (%)
100.17 | 1 | < 0.1%
100.32 | 1 | < 0.1%
100.52 | 1 | < 0.1%
100.83 | 1 | < 0.1%
100.96 | 1 | < 0.1%
101.15 | 1 | < 0.1%
101.48 | 1 | < 0.1%
101.62 | 1 | < 0.1%
101.66 | 1 | < 0.1%
101.71 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
174.61 | 1 | < 0.1%
174.4 | 1 | < 0.1%
174.25 | 1 | < 0.1%
173.92 | 1 | < 0.1%
173.43 | 1 | < 0.1%
173.26 | 1 | < 0.1%
172.97 | 1 | < 0.1%
172.96 | 1 | < 0.1%
172.54 | 2 | < 0.1%
172.15 | 1 | < 0.1%

CDP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3977
Distinct (%): 17.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.06020766
Minimum: 9.8754
Maximum: 15.081
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 9.8754
5-th percentile: 10.405
Q1: 11.395
Median: 12.001
Q3: 12.4435
95-th percentile: 14.035
Maximum: 15.081
Range: 5.2056
Interquartile range (IQR): 1.0485

Descriptive statistics

Standard deviation: 1.114264876
Coefficient of variation (CV): 0.09239184829
Kurtosis: -0.6361301072
Mean: 12.06020766
Median Absolute Deviation (MAD): 0.532
Skewness: 0.2693521219
Sum: 267628.0682
Variance: 1.241586215
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
12.122 | 32 | 0.1%
12.165 | 31 | 0.1%
11.908 | 29 | 0.1%
11.795 | 28 | 0.1%
11.891 | 28 | 0.1%
11.934 | 28 | 0.1%
11.83 | 28 | 0.1%
12.048 | 28 | 0.1%
12.02 | 27 | 0.1%
11.839 | 27 | 0.1%
Other values (3967) | 21905 | 98.7%

Smallest values
Value | Count | Frequency (%)
9.8754 | 1 | < 0.1%
9.9044 | 1 | < 0.1%
9.9286 | 1 | < 0.1%
9.9428 | 1 | < 0.1%
9.9591 | 1 | < 0.1%
9.9641 | 1 | < 0.1%
9.969 | 1 | < 0.1%
9.9759 | 1 | < 0.1%
9.9852 | 1 | < 0.1%
9.9854 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
15.081 | 1 | < 0.1%
15.055 | 1 | < 0.1%
15.043 | 1 | < 0.1%
15.031 | 1 | < 0.1%
15.002 | 1 | < 0.1%
14.976 | 1 | < 0.1%
14.958 | 1 | < 0.1%
14.913 | 1 | < 0.1%
14.908 | 1 | < 0.1%
14.872 | 1 | < 0.1%

CO
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18104
Distinct (%): 81.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.214390083
Minimum: 0.00038751
Maximum: 44.103
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 0.00038751
5-th percentile: 0.375205
Q1: 0.995375
Median: 1.5242
Q3: 2.5424
95-th percentile: 6.1675
Maximum: 44.103
Range: 44.10261249
Interquartile range (IQR): 1.547025

Descriptive statistics

Standard deviation: 2.295746499
Coefficient of variation (CV): 1.036739876
Kurtosis: 52.88783532
Mean: 2.214390083
Median Absolute Deviation (MAD): 0.64708
Skewness: 4.932390189
Sum: 49139.53033
Variance: 5.270451988
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
1.2398 | 6 | < 0.1%
1.3318 | 6 | < 0.1%
1.5777 | 5 | < 0.1%
1.3204 | 5 | < 0.1%
1.3241 | 5 | < 0.1%
1.2288 | 5 | < 0.1%
1.5724 | 5 | < 0.1%
1.367 | 5 | < 0.1%
1.4341 | 5 | < 0.1%
1.0774 | 5 | < 0.1%
Other values (18094) | 22139 | 99.8%

Smallest values
Value | Count | Frequency (%)
0.00038751 | 1 | < 0.1%
0.0015935 | 1 | < 0.1%
0.0036653 | 1 | < 0.1%
0.0050334 | 1 | < 0.1%
0.0061148 | 1 | < 0.1%
0.007505 | 1 | < 0.1%
0.0089313 | 1 | < 0.1%
0.01096 | 1 | < 0.1%
0.013457 | 1 | < 0.1%
0.01644 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
44.103 | 1 | < 0.1%
43.622 | 1 | < 0.1%
43.428 | 1 | < 0.1%
43.397 | 1 | < 0.1%
39.05 | 1 | < 0.1%
37.746 | 1 | < 0.1%
35.045 | 1 | < 0.1%
35.019 | 1 | < 0.1%
34.496 | 1 | < 0.1%
34.467 | 1 | < 0.1%

NOX
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 16359
Distinct (%): 73.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.77652873
Minimum: 27.765
Maximum: 119.91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 173.5 KiB

Quantile statistics

Minimum: 27.765
5-th percentile: 53.694
Q1: 61.548
Median: 67.096
Q3: 74.572
95-th percentile: 87.1325
Maximum: 119.91
Range: 92.145
Interquartile range (IQR): 13.024

Descriptive statistics

Standard deviation: 11.03623121
Coefficient of variation (CV): 0.1604650804
Kurtosis: 2.470670204
Mean: 68.77652873
Median Absolute Deviation (MAD): 6.364
Skewness: 1.113054702
Sum: 1526219.949
Variance: 121.7983994
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
60.38 | 104 | 0.5%
64.109 | 7 | < 0.1%
66.012 | 6 | < 0.1%
61.579 | 6 | < 0.1%
66.696 | 6 | < 0.1%
77.17 | 6 | < 0.1%
66.264 | 6 | < 0.1%
61.489 | 6 | < 0.1%
60.295 | 6 | < 0.1%
56.644 | 6 | < 0.1%
Other values (16349) | 22032 | 99.3%

Smallest values
Value | Count | Frequency (%)
27.765 | 1 | < 0.1%
41.777 | 1 | < 0.1%
42.093 | 1 | < 0.1%
43.198 | 1 | < 0.1%
43.226 | 1 | < 0.1%
43.242 | 1 | < 0.1%
43.247 | 1 | < 0.1%
43.274 | 1 | < 0.1%
43.471 | 1 | < 0.1%
43.7 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
119.91 | 1 | < 0.1%
119.9 | 1 | < 0.1%
119.89 | 1 | < 0.1%
119.79 | 1 | < 0.1%
119.48 | 1 | < 0.1%
119.43 | 1 | < 0.1%
119.41 | 1 | < 0.1%
119.39 | 1 | < 0.1%
119.32 | 1 | < 0.1%
119.28 | 1 | < 0.1%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
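
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the helper names `average_ranks`, `pearson` and `spearman` and the toy data are assumptions made for the example. Ties receive the average of their rank positions, which is a common convention.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (hypothetical): y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("spearman:", spearman(x, y))
print("pearson: ", pearson(x, y))
```

On this toy data the rank correlation is perfect while Pearson's r falls short of 1, which is the difference the paragraph above describes.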

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
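
As a minimal sketch of that formula and of the invariance property (the function name `pearson` and the sample data are hypothetical, chosen only for illustration), shifting and positively rescaling either variable leaves r unchanged:

```python
import math

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and in the standard deviations cancel,
    so plain sums of centred products are used.
    """
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample data.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]

r = pearson(x, y)
# Change location and (positive) scale of each variable separately.
r_transformed = pearson([3 * a + 10 for a in x], [0.5 * b - 4 for b in y])

print(r, r_transformed)
```

A negative scale factor on one variable would flip the sign of r but not its magnitude.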

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
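
The pair-counting definition can be sketched directly. This is the simple variant usually called tau-a, which matches the description above; libraries commonly report a tie-corrected variant (tau-b), so values can differ when the data contain ties. The function name `kendall_tau_a` and the toy data are assumptions for the example.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Same sign of the differences: the pair is ordered the same way in x and y.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means a tie in x or y; it counts as neither.
    return (concordant - discordant) / len(pairs)

# Toy data (hypothetical): one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6. The quadratic loop is fine for illustration but slow for a dataset of this size.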

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

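index | AT | AP | AH | AFDP | GTEP | TIT | TAT | TEY | CDP | CO | NOX
0 | 4.5878 | 1018.7 | 83.675 | 3.5758 | 23.979 | 1086.2 | 549.83 | 134.67 | 11.898 | 0.32663 | 81.952
1 | 4.2932 | 1018.3 | 84.235 | 3.5709 | 23.951 | 1086.1 | 550.05 | 134.67 | 11.892 | 0.44784 | 82.377
2 | 3.9045 | 1018.4 | 84.858 | 3.5828 | 23.990 | 1086.5 | 550.19 | 135.10 | 12.042 | 0.45144 | 83.776
3 | 3.7436 | 1018.3 | 85.434 | 3.5808 | 23.911 | 1086.5 | 550.17 | 135.03 | 11.990 | 0.23107 | 82.505
4 | 3.7516 | 1017.8 | 85.182 | 3.5781 | 23.917 | 1085.9 | 550.00 | 134.67 | 11.910 | 0.26747 | 82.028
5 | 3.8858 | 1017.7 | 83.946 | 3.5824 | 23.903 | 1086.0 | 549.98 | 134.67 | 11.868 | 0.23473 | 81.748
6 | 3.6697 | 1018.0 | 84.114 | 3.5804 | 23.889 | 1085.9 | 550.04 | 134.68 | 11.877 | 0.44412 | 84.592
7 | 3.5892 | 1018.2 | 83.867 | 3.5777 | 23.876 | 1086.0 | 549.88 | 134.66 | 11.893 | 0.79996 | 84.193
8 | 3.7108 | 1018.5 | 84.948 | 3.6027 | 23.957 | 1086.3 | 549.98 | 134.65 | 11.870 | 0.68996 | 83.978
9 | 4.8281 | 1018.5 | 85.346 | 3.5158 | 23.422 | 1083.1 | 549.80 | 132.67 | 11.694 | 1.02810 | 82.654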

Last rows

index | AT | AP | AH | AFDP | GTEP | TIT | TAT | TEY | CDP | CO | NOX
22181 | 7.1267 | 1022.4 | 85.064 | 4.6292 | 36.341 | 1100.5 | 523.17 | 170.48 | 14.791 | 1.4063 | 80.600
22182 | 7.2022 | 1023.6 | 80.694 | 4.6069 | 36.399 | 1099.6 | 522.05 | 170.56 | 14.824 | 1.4269 | 79.918
22183 | 6.3239 | 1024.7 | 88.633 | 4.6440 | 36.657 | 1099.6 | 521.08 | 172.04 | 14.867 | 1.6709 | 79.344
22184 | 5.6777 | 1025.3 | 92.704 | 4.6708 | 36.803 | 1099.8 | 521.11 | 172.54 | 14.849 | 1.5296 | 80.540
22185 | 5.4158 | 1026.1 | 82.718 | 4.6417 | 36.950 | 1100.0 | 521.10 | 172.96 | 14.811 | 1.4415 | 80.553
22186 | 4.8631 | 1027.0 | 81.084 | 4.2825 | 34.045 | 1100.0 | 529.98 | 168.38 | 14.290 | 1.2538 | 78.397
22187 | 4.5173 | 1027.4 | 80.813 | 4.2481 | 33.904 | 1100.1 | 530.47 | 168.07 | 14.344 | 1.0808 | 78.251
22188 | 4.2717 | 1027.9 | 80.380 | 4.2817 | 34.165 | 1099.9 | 529.56 | 168.55 | 14.395 | 1.0472 | 77.269
22189 | 4.0853 | 1028.6 | 78.907 | 4.2313 | 33.802 | 1100.1 | 530.61 | 167.98 | 14.343 | 1.0875 | 77.985
22190 | 4.2148 | 1029.4 | 70.679 | 4.2049 | 33.768 | 1100.0 | 530.97 | 167.30 | 14.291 | 1.1337 | 78.950